Gene data generation method based on generative adversarial network
Yimin CAO, Lei CAI, Jingyang GAO
Journal of Computer Applications    2022, 42 (3): 783-790.   DOI: 10.11772/j.issn.1001-9081.2021040759

In deep learning, the deeper a Convolutional Neural Network (CNN) becomes, the more data its training requires. Gene structure variation, however, is a small-sample event in large-scale genetic data, so image data of variant genes is severely scarce. This scarcity seriously degrades CNN training and leads to low precision and a high false positive rate in gene structure variation detection. To increase the number of gene structure variation samples and improve the precision with which a CNN identifies gene structure variation, a gene image data augmentation method based on a Generative Adversarial Network (GAN), named GeneGAN, was proposed. Firstly, initial gene image data was generated with the reads stacking method and divided into two datasets: variant gene images and non-variant gene images. Secondly, GeneGAN was used to augment the variant image samples so that the positive and negative datasets were balanced. Finally, a CNN was used to perform detection on the datasets before and after augmentation, with precision, recall and F1 score as the evaluation metrics. Experimental results show that, compared with traditional augmentation methods, GAN-based augmentation methods and feature extraction methods, GeneGAN improves the F1 score by 1.94 to 17.46 percentage points, verifying that GeneGAN can improve the precision with which a CNN identifies gene structure variation.
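
The evaluation metrics named in the abstract, and the class-balancing goal of the augmentation step, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `precision_recall_f1` and `synthetic_needed`, the label convention (1 = variant, 0 = non-variant), and the sample counts in the usage lines are all assumptions made for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive (variant) class.

    Assumed convention: label 1 = variant image, label 0 = non-variant image.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def synthetic_needed(n_variant, n_non_variant):
    """Number of generated variant images needed to match the non-variant count.

    Hypothetical helper: it only expresses the balancing target, not how the
    images are generated.
    """
    return max(0, n_non_variant - n_variant)


# Illustrative labels only, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

# Illustrative counts only: 120 variant vs 1000 non-variant images.
print(synthetic_needed(120, 1000))
```

A gain reported in percentage points, as in the abstract, is the difference between two such F1 values expressed in percent (for example, 0.80 to 0.82 is 2 percentage points), not a relative change.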
